



























## **Agilent Technologies**

# Effective Ways of Testing BER on 40Gb/s High-Speed Interface Devices

November 6, 2001

presented by:

Lee Nichols, Boyd Troop

### Outline

- Introduction
- Measurement Approaches for Digital Communications
- Simulating Real Data Streams
- Measuring Bit Errors
- 40 Gb/s Waveforms and Technology
- Flexible BERT Design
- Device Testing Architectures
- Conclusion






In a carrier multiplexed system, signal quality is typically assessed channel-by-channel using spectral measurements of signal power versus noise power because such systems are inherently narrowband. Spectral measurements can offer detailed performance insights and align well with R&D testing objectives.

In a high-rate baseband system where the spectral content of a single high-capacity channel is inherently wideband, time-based measurements account for variations across the entire spectrum with a single measurement. Time-based measurements offer more direct comparison to operational performance and align well with manufacturing test philosophies.

In general, the relative advantage of these measurement approaches depends upon the user's test objectives in terms of cost, time, and/or complexity of data output.

Also note that many systems (WDM and broadband, for instance) can be considered both channelized and baseband and can therefore take advantage of multiple measurement approaches.



Among communications device and system manufacturers, time-based or time-domain measurements of digital waveforms have become standard.

These measurements can be thought of as outward manifestations of underlying spectral characteristics.

The most common are:

- Bit Error Ratio: a black-box approach to quality measurement via bit-by-bit comparison
- Eye diagrams: a method of investigating pulse transmission amplitude characteristics
- Timing jitter: a time-domain alternative to phase noise measurement for investigating pulse transmission timing characteristics



Because high-rate baseband systems are inherently wideband, it is important to fully exercise the system componentry by simulating the wide spectral content of real communication streams.

Communications test equipment typically generates pseudo-random bit sequences (PRBS) for this purpose. A maximal-length PRBS contains every possible sequence of N bits except the all-zeros sequence.

Key spectral features of the PRBS are determined by the bit rate, the pattern length, and the basic pulse shape.



Bit Error Rate Testers (BERTs) use hardware shift registers to generate PRBS streams and to provide an auto-syncing analysis path for the returned data.

For stressing clock recovery circuits, BERTs offer patterns with user-definable mark densities or long mark sequences.

To test network elements with desired control responses to frame or packet overhead, BERTs may offer integrated protocol generation/analysis.

BERTs may also offer memory access for completely user-defined patterns.
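The shift-register generation described above can be sketched in software. This is a minimal illustration, assuming the common PRBS7 polynomial x^7 + x^6 + 1; the function name `prbs7` and the seed value are hypothetical choices, not something specified in the talk, and a real BERT implements the register in hardware.

```python
def prbs7(n_bits, seed=0x7F):
    """Return n_bits of a PRBS7 stream (polynomial x^7 + x^6 + 1)."""
    state = seed & 0x7F
    if state == 0:
        # An all-zeros register would stay at zero forever.
        raise ValueError("seed must be non-zero")
    out = []
    for _ in range(n_bits):
        # Feedback bit is the XOR of register stages 7 and 6.
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        out.append(new_bit)
        # Shift the register and insert the feedback bit.
        state = ((state << 1) | new_bit) & 0x7F
    return out


if __name__ == "__main__":
    bits = prbs7(127)
    print("pattern length:", len(bits), "ones:", sum(bits))
```

Over one 127-bit period, every non-zero 7-bit window appears exactly once, which is what gives the pattern its wide spectral content.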



In general terms, bit error rate is a black box measurement of transmission quality used to verify the performance of key elements of a transport link. BER measurements can be used to verify both electrical and optical transmitter or receiver hardware as well as to quantify the influence of intermediate elements such as amplifiers, switching fabrics, or even the transmission lines themselves.

Bit errors can be caused by hardware faults in network element line cards; jitter, wander and other types of noise which may or may not be pattern dependent; or signal degradation due to transmission effects such as attenuation, dispersion, or crosstalk.

Errors can be corrected on the fly, transparently to the user, by software or hardware detection/correction algorithms that rely either on retransmission of data or on piggyback coding (FEC or ATM HEC, for instance) to regenerate the data.



Bit marks (zero or one) are determined by the analyzer's decision threshold and the signal level at the sampling instant.

As signal-to-noise ratio (SNR) decreases, the sampled signal will more frequently fall out of its intended mark range resulting in an error.

The probability of such errors can be calculated based on the statistics associated with the accumulated noise.

These statistics may vary with the average amplitude at the sampling instant and where that sampling instant falls within the bit period or unit interval due to the dependence of noise in some components (shot noise in APDs, for instance) upon signal level and the presence of jitter.



The bit error rate is the experimental measure of the error probability based on the accumulated errors within a statistically valid sample of bits. Commonly, a statistically valid sample is taken to be the number of bits accumulated to obtain 100 errors. In general, one can calculate the minimum test time required to verify a maximum BER based on the bit rate and a statistically valid number of errors N.

For lower bit rates, the single-shot test time may be unacceptable (for instance, verifying a BER below 1e-9 at 155 Mb/s using 100 errors requires over 10 minutes). In these situations, performance is typically extrapolated from measurements taken under artificially degraded signal-to-noise conditions.
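The test-time arithmetic above can be sketched as follows. The function name `min_test_time_s` is a hypothetical helper; the 155.52 Mb/s and 39.81312 Gb/s figures are the nominal STS-3 and STS-768 line rates, used here as assumed inputs.

```python
def min_test_time_s(max_ber, bit_rate_bps, n_errors=100):
    """Seconds needed to accumulate n_errors at the given BER and bit rate."""
    bits_needed = n_errors / max_ber      # bits to observe n_errors on average
    return bits_needed / bit_rate_bps     # time to transmit that many bits


if __name__ == "__main__":
    for label, rate in (("155.52 Mb/s", 155.52e6), ("39.81312 Gb/s", 39.81312e9)):
        t = min_test_time_s(1e-9, rate)
        print(f"{label}: {t:.1f} s ({t / 60:.2f} min)")
```

At 155.52 Mb/s the result is a little under 11 minutes, consistent with the "over 10 minutes" quoted above, while the same verification at the 40 Gb/s class rate takes only a few seconds.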

System sensitivity is typically defined as the minimum average signal level (average of 1 and 0) at which a maximum BER is achieved. This would be the average electrical amplitude or the average optical power depending upon where in the system the sensitivity is being measured.

Sensitivity typically degrades somewhat with higher data rate designs simply due to the intrinsically wider noise bandwidth.



In a non-return-to-zero data stream, either mark is generated by maintaining the associated signal amplitude during the entire unit interval excepting the unavoidable rise and fall times when the mark changes. In this data format, the average signal level is simply half the sum of the levels associated with the marks. As it is possible to get strings of ones or zeroes without intervening transitions (for example, where the dashed UI boundaries are shown above), clock recovery circuits must be designed with this in mind.

Given the speed of electromagnetic waves in optical fiber or coaxial cable, the 25 ps UI associated with a 40 Gb/s stream extends over roughly 5 mm. Over a typical optical transmission distance of a few km, even thermal material changes may be significant contributors to signal jitter or wander as percentage of UI.

In the frequency domain, an ideal 1010 pattern appears as an infinite comb of odd harmonics of the fundamental frequency which is one-half of the bit rate. Thus, the spectrum of a 40 Gb/s 1010 pattern contains components at 20, 60, 100 GHz and so on. When the data is pseudo-randomized, each component becomes a lobe containing multiples of the pattern repetition frequency.

Real systems operate with bandwidth limitations which generate longer transition times between bits as a result of the loss of high-frequency spectral components. With a 40 Gb/s data stream, even a relatively heroic bandwidth of 200 GHz results in 0.1 UI of transition time.
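The numbers quoted in these notes can be reproduced with a few lines of arithmetic. This is a rough sketch: the propagation speed of 2e8 m/s (about two-thirds of the vacuum speed of light) is an assumed typical value for fiber or coax, and the function names are hypothetical.

```python
def unit_interval_s(bit_rate_bps):
    """Duration of one unit interval (UI) in seconds."""
    return 1.0 / bit_rate_bps


def ui_length_m(bit_rate_bps, velocity_m_per_s=2.0e8):
    """Physical length occupied by one UI; velocity is an assumed value."""
    return unit_interval_s(bit_rate_bps) * velocity_m_per_s


def alternating_pattern_harmonics_hz(bit_rate_bps, count=3):
    """First `count` spectral lines of an ideal 1010 pattern (odd harmonics)."""
    fundamental = bit_rate_bps / 2.0
    return [(2 * k + 1) * fundamental for k in range(count)]


if __name__ == "__main__":
    rate = 40e9
    print("UI [ps]:", unit_interval_s(rate) * 1e12)
    print("UI length [mm]:", ui_length_m(rate) * 1e3)
    print("1010 harmonics [GHz]:",
          [f / 1e9 for f in alternating_pattern_harmonics_hz(rate)])
```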



Return-to-zero format relies on intra-symbol transitions during mark ones to improve clock recovery. One drawback to this format is the minimum system bandwidth required: while a designer can get away with a single spectral lobe and still reproduce some semblance of a pulse (albeit with significant transition times), that lobe is twice as wide in an RZ scheme simply due to the shorter pulse widths.

Nevertheless, RZ formatting has significant advantages for long-haul transmission including:

- lower average transmitter power than NRZ for the same mark-one amplitude (specifically, half the optical power or a fourth of the electrical power);
- lower power experiences higher gain in optical amplifiers, which are typically operated in saturation;
- short pulses minimize degradation due to chromatic dispersion;
- short pulses lend themselves to optical soliton transmission techniques.

## Electronics at 40 Gb/s

|                                                  | Si                         | SiGe                     | GaAs                       | InP                             |
|--------------------------------------------------|----------------------------|--------------------------|----------------------------|---------------------------------|
| History                                          | Many years, many foundries | A few years, 3 foundries | Many years, many foundries | On the edge, some research labs |
| Technology f<sub>T</sub> today (in 3 yrs)        | 20 (50) GHz                | 100 (250) GHz            | 100 (600) GHz              | 150 (600) GHz                   |
| Complexity                                       | high                       | low to medium            | low                        | low                             |
| Reliability                                      | proven                     | proven                   | proven                     | 'in doubt'                      |
| Commercial                                       | low                        |                          | x10                        |                                 |
| Electron mobility (cm^2/V-s) at room temperature | 1350                       | 2000-3000                | 8500                       | 4000                            |

#### Compound semiconductor devices benefit from higher electron mobilities




The types of wideband amplifiers used for high data-rate applications are equivalent in design to those that have been used for years in aerospace/defense applications. Unlike standard low-noise RF amplifiers, those used for high-speed digital communications require very low low-frequency cutoffs in order to support random data streams and higher saturated output powers to drive optical modulators.

Higher-speed (microwave/mm-wave) amplifier technologies require DC blocking or AC coupling capacitors at their inputs and outputs. These components serve to:

- protect the sensitive transistor technologies employed in these types of amplifiers from substantial DC or low-frequency transient currents;
- isolate the amplifier stages from external impedances, thereby holding performance over a variety of excitation/load conditions; and
- protect the available signal gain from DC saturation.

Unfortunately, AC coupling limits the locations in the system architecture where adjustable decision thresholds, which are required for detailed signal quality measurements, can be implemented.



CATV and telco vendors have relied on very high-speed direct (current) modulation of distributed feedback (DFB) lasers for lower data rate systems, but this technique is limited by the inherent frequency chirp associated with current injection of semiconductor lasers and the internal dynamics of the lasing medium to around 10 GHz.

The modulation bandwidth demanded by 25 ps pulses requires the use of external modulators such as LiNbO<sub>3</sub> Mach-Zehnder modulators (MZMs), which employ the electro-optic effect and have potential bandwidths in the hundreds of GHz.

The highest-speed modulators rely on velocity-matched traveling-wave designs where the coupling between the microwave and optical fields is maximized by using waveguide geometries such that the two waves travel through the device at nearly the same speed.



Given the raw speed of real-time sampling A/Ds, capturing data streams at OC-48 and above necessitates the kind of sequential or random repetitive sampling architectures used in high-speed digital communications analyzers (DCAs).

However, at OC-768, even these types of measuring instruments may be limited by insufficient detection bandwidth when measuring the very highest speed waveforms, such as RZ patterns. Minimum transition times and internal timebase jitter must be considered when evaluating DCA measurements.

Higher-speed optical sampling techniques have been developed using nonlinear optical materials at various laboratories.

As with electronic parts, high-speed photodetector designs rely on high-electron-mobility compound semiconductors.



While the inherent timebase jitter in a high-speed DCA may be sufficiently small for waveform analysis, it will not generally allow for accurate measurement of the random jitter associated with the signal. This can be improved using available timebase enhancement modules.

For in-depth analysis of jitter issues, it may be necessary to use a phase noise measurement system which can isolate the timing component of system noise from the amplitude component.









- Let's begin by discussing specifications for communication devices. In the communication industry, ratios, probabilities, retries, and retransmits are facts of life since the industry realizes that no complex system will work with zero errors unless you are prepared to spend an infinite amount of money.
- Therefore, many of the measurements are based on Bit Error Ratios (BER) and BER Threshold limits. Designers of chips in other industries don't deal with such ratios, but use absolute maximum and minimum measured time values instead.

Realizing that, let's define some timing measurements.

1) Optimum Sampling Point: the middle of the data cycle, or 50% point
   - this is the point that provides the most margin against errors due to non-rectangular or jittery waveform shapes
   - most clock recovery circuits adjust the device sampling clocks to this time; however,
   - devices that use an external clock input may not set 50% as "optimum" because the designer is aware of internal device delays

2) Phase Margin: the width of the area of the cycle where the BER is better than the specified Error Threshold limit
   - if the rise and fall times are zero (infinitely fast edges) and there is no jitter, then phase margin equals the period
   - as the clock frequency approaches the device's operating limits, the phase margin decreases because the transition times and jitter become a significant portion of the period

3) Skew: the deviation of the 50% point from the average over all bus lines

4) Clock-to-Data min/max: the actual time from the clock edge to the Vout<sub>High</sub> or Vout<sub>Low</sub> of the channel
   - this is the latency/delay from the clock edge until the output reaches the specified high or low voltage levels
   - these are usually equal to the phase margin



Assuming the amplitude noise associated with the measured signal to be Gaussian in nature, the Q-factor is the delta between mark one and mark zero divided by the sum of the standard deviations of the associated noise contributions. Like other "quality" factor definitions, it is similar to a signal-to-noise ratio, but is measured in terms of signal amplitude rather than power.

One can show that if the decision threshold is optimized to minimize BER, the probability of a bit error becomes proportional to the complementary error function of the Q-factor. Since Q is easily related to received signal and noise power, this expression can be used to calculate system sensitivity.

In order to determine the optimum sampling instant and decision threshold, Q as a function of sampling instant can be measured experimentally using histograms of PRBS eye diagrams on a DCA.

Alternatively, one can derive signal and noise parameters from measured BER as the decision threshold is adjusted into the noise tails and use these directly to calculate the optimum sampling conditions.
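The relationship between Q and BER described above can be sketched numerically. The talk only states that the error probability is proportional to the complementary error function of Q; the specific form BER = 0.5 * erfc(Q / sqrt(2)) used below is the conventional Gaussian-noise expression and is an assumption here, as are the function names.

```python
import math


def q_factor(mu1, mu0, sigma1, sigma0):
    """Q = (mean of ones - mean of zeros) / (sum of noise standard deviations)."""
    return (mu1 - mu0) / (sigma1 + sigma0)


def ber_from_q(q):
    """Estimated BER for Gaussian noise at the optimum decision threshold."""
    return 0.5 * math.erfc(q / math.sqrt(2.0))


if __name__ == "__main__":
    for q in (6.0, 7.0):
        print(f"Q = {q}: BER ~ {ber_from_q(q):.2e}")
```

With this expression, Q values of about 6 and 7 correspond to BERs near 1e-9 and 1e-12 respectively, which is why Q is a convenient shorthand when direct BER measurement would take too long.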

### Bath Tub Measurement



The bathtub-shaped plot shown on this slide represents the bit error rate versus the sampling time within the bit period. This plot is used for the evaluation of timing measurements.

Let's talk about this "bathtub" plot in more general terms first. The general shape looks similar to an eye opening. The middle of the dip, where the errors are low, is clearly the opening, and the right and left edges, where the errors reach their maximum, are clearly outside of the eye.

An "ideal eye" would be rectangular, but that is impossible to attain, especially as the operating clock frequency increases. The slope of the edges is proportional to the rise and fall times of the waveform. The "width" of the edges is related to the jitter.

If you allow the measurements to run for a long period of time, you would probably see the edges of the tub getting wider like you would on a scope in infinite persistence. The width of the edges would represent the "jitter."

If the jitter is random, the result is expressed as an RMS value; otherwise it is expressed as peak-to-peak.

The Error Threshold is the BER value beyond which the device would be out-of-spec.
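As an illustration of how phase margin could be read off a bathtub measurement, the sketch below scans a list of (sampling delay, BER) points and returns the width of the region around the BER minimum that stays below the Error Threshold. The function name and the sample data are hypothetical, and the result is only as fine as the delay grid (no interpolation is attempted).

```python
def phase_margin_ui(delays_ui, bers, error_threshold):
    """Width in UI of the contiguous region around the BER minimum
    where the measured BER is better than error_threshold."""
    best = min(range(len(bers)), key=lambda i: bers[i])
    if bers[best] >= error_threshold:
        return 0.0  # the eye never meets the threshold
    left = right = best
    # Walk outward from the bottom of the tub until the threshold is crossed.
    while left > 0 and bers[left - 1] < error_threshold:
        left -= 1
    while right < len(bers) - 1 and bers[right + 1] < error_threshold:
        right += 1
    return delays_ui[right] - delays_ui[left]


if __name__ == "__main__":
    # Hypothetical bathtub data: sampling delay in UI versus measured BER.
    delays = [i / 10 for i in range(11)]
    bers = [0.5, 1e-2, 1e-5, 1e-9, 1e-12, 1e-13, 1e-12, 1e-9, 1e-5, 1e-2, 0.5]
    print("phase margin [UI]:", phase_margin_ui(delays, bers, 1e-10))
```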



Differential signaling is very widely used together with MUX/DEMUX circuitry. Differential means that each output or input is paired with a complementary output or input. The complementary line always carries the logical opposite of the normal line, so the two never sit at the same logic level (as long as nothing is broken).

The DUT input provides termination for eliminating reflections. This ensures clean signals.

The generator outputs providing the stimulus signals for the DUT are AC coupled. This prevents the application of harmful DC voltages to the DUT. The low-frequency cutoff resulting from AC coupling should be low enough that all low-frequency components of the transmitted pattern pass through. In practice, a cutoff below 300 kHz has proven sufficient.

The analyzer connected to the DUT offers differential inputs. This is the right method to detect the proper bit change at the crossing point of the two input signals. It is essential to keep the connection length of both signal lines between DUT and Analyzer input at identical length to maintain the same phase relation at the tester input. The tester input is also terminated together with the AC coupled input.







| Application         | Examples                                                           | Serial speed   | MUX ratios |
|---------------------|--------------------------------------------------------------------|----------------|------------|
| Local Area Networks | Gigabit Ethernet, Fiber Channel, Infiniband, Storage Area Networks | 2.5 / 3.2 Gb/s | 1:10, 1:4  |
| Global Networks     | Terrestrial, SONET/SDH, Undersea                                   | 40 Gb/s        | 1:4, 1:16  |
| Digital Video       | HDTV, Display Links, Flat Panel                                    | 1.5 Gb/s       | 1:7        |
| Components          | Backplane serializer / de-serializer, Multiple serial, E/O, O/E    | various        | 1:4, 1:16  |



### SONET frame

SONET (Synchronous Optical NETwork) is a standardized technology for digital communication. It allows the carrying of multiple signals of different capacities through a synchronous, flexible, optical hierarchy.

This is achieved through byte-wise multiplexing, which supports end-to-end network management. This slide shows the basic or lowest level for (byte-) multiplexing, which consists of 9 rows and 90 columns of bytes, transmitted from left to right and top to bottom, starting with the MSB.

SONET signals are transmitted in a frame structure. In addition to the data payload, the SONET frame includes a header, also known as the transport overhead. This overhead includes data such as the destination address, redundancy check data, etc.

The base transmission rate of this technology, OC-1 (Optical Carrier) or STS-1 (Synchronous Transport Signal), is 51.840 Mb/s.
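The 51.840 Mb/s figure follows directly from the frame dimensions given above. The sketch below assumes the standard SONET frame rate of 8000 frames per second (one frame every 125 µs), which is not stated in these notes; the constant and function names are illustrative.

```python
ROWS = 9
COLUMNS = 90
BITS_PER_BYTE = 8
FRAMES_PER_SECOND = 8000  # assumed: one frame every 125 microseconds


def sts1_rate_bps():
    """Line rate of the basic 9-row by 90-column SONET frame."""
    return ROWS * COLUMNS * BITS_PER_BYTE * FRAMES_PER_SECOND


if __name__ == "__main__":
    print("STS-1 rate [Mb/s]:", sts1_rate_bps() / 1e6)
```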



## SONET muxing

This slide illustrates byte-wise multiplexing from STS-1 to STS-3, raising the data rate from 51.84 Mb/s to 155.52 Mb/s. This method starts with the first byte of the first frame, then takes the first byte of the second frame, etc.

The table at the bottom of this slide lists the resulting STS-n/OC-n (SDH) line rates.
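The byte-wise interleaving can be sketched as below. This is a simplified illustration that ignores frame alignment and overhead processing; the function names are hypothetical, and the tributaries are assumed to be equal-length byte strings.

```python
def byte_interleave(tributaries):
    """Interleave equal-length byte strings one byte at a time."""
    length = len(tributaries[0])
    if any(len(t) != length for t in tributaries):
        raise ValueError("all tributaries must have the same length")
    # Take byte 0 of each tributary, then byte 1 of each, and so on.
    return bytes(b for group in zip(*tributaries) for b in group)


def sts_rate_bps(n, sts1_bps=51_840_000):
    """Nominal STS-n line rate as n times the STS-1 rate."""
    return n * sts1_bps


if __name__ == "__main__":
    print(byte_interleave([b"AAA", b"BBB", b"CCC"]))
    for n in (3, 48, 192, 768):
        print(f"STS-{n}: {sts_rate_bps(n) / 1e6:.2f} Mb/s")
```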

### SDH

SONET was standardized by ANSI in 1984. In 1989, CCITT issued the SDH (Synchronous Digital Hierarchy) standard, which includes SONET as a subset. As progress was made in terms of line rates, STM-1 (Synchronous Transport Module level 1) was defined as the SDH equivalent of STS-3.



ITU-T G.709: This standard provides requirements for the Optical Transport Hierarchy (OTH) signals at the Network Node Interface in terms of:

- definition of an Optical Transport Module of order n (OTM-n)
- structures for an OTM
- formats for mapping and multiplexing client signals
- functionality of overhead

We will now discuss how a client payload maps into an optical transport unit and how this mapping (generating the overhead) plus Forward Error Correction change the necessary line rates, particularly because these are different for (the equivalent of) "OC48, OC192" and "OC768".

### Optical Transport Unit-k (OTU-k)

The OTU basically represents the layer before the signal is presented to an "optical modulation unit" for a given wavelength λ; that is, all data manipulation has been done, including the generation of all necessary overhead, FEC, and scrambling.

As with SONET, the basic structure of the OTU is a frame consisting of rows and columns. Unlike SONET, the OTU features only 4 rows and 255x16=4080 columns, which are transmitted in the same order as the SONET frame, i.e. from left to right and top to bottom, starting with the MSB.

The OTU contains 1x16 columns (starting with column 1) of ODU/OTU-k overhead, followed by 238x16=3808 columns of payload and, finally, 16x16=256 columns of FEC.
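To make the effect of overhead and FEC on line rate concrete, the sketch below computes nominal OTU-k rates from the SONET client rates. The 255/(239 - k) multipliers are taken from ITU-T G.709 rather than from these notes, so treat them as an assumption of this example; the names are illustrative.

```python
from fractions import Fraction

# Client rates in bit/s for the equivalents of OC-48, OC-192 and OC-768.
CLIENT_RATE_BPS = {1: 2_488_320_000, 2: 9_953_280_000, 3: 39_813_120_000}

# Frame layout in columns: overhead + payload + FEC.
OVERHEAD_COLS, PAYLOAD_COLS, FEC_COLS = 16, 238 * 16, 16 * 16


def otu_rate_bps(k):
    """Nominal OTU-k line rate (assumed G.709 ratio of 255 / (239 - k))."""
    return Fraction(255, 239 - k) * CLIENT_RATE_BPS[k]


if __name__ == "__main__":
    print("columns per row:", OVERHEAD_COLS + PAYLOAD_COLS + FEC_COLS)
    for k in (1, 2, 3):
        print(f"OTU-{k}: {float(otu_rate_bps(k)) / 1e9:.6f} Gb/s")
```

Under these assumptions the OC-768 equivalent rises from about 39.8 Gb/s to about 43.0 Gb/s, which is why test equipment for this class of device needs headroom beyond 40 Gb/s.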



- Reduced required bandwidth: interchannel interference can be compensated through the use of error correction, allowing channels to be spaced closer together.
- Increased throughput: with error correction, fewer retries are necessary to transmit the data.
- Reduced transmitter power: the effective sensitivity of the system is increased.
- Increased range: the effective sensitivity of the system is increased.
- Reduced required noise figure of the receiver: with error correction, a higher raw BER can be tolerated while maintaining the same effective BER.
- Flexible BERT design: allows for testing of various coding schemes. The R&D engineer can make trade-offs between FEC overhead, bandwidth, and BER.



Things to keep in mind when testing at 40 Gb/s:

- **Cable length:** good-quality cable can have 7 dB of loss for a 24" length.
- **Differential drive:** if cables are not matched, some means of aligning DATA and DATA/ is needed.
- **Receiver:** will the unit operate with a single-ended signal?
- **Differential receiver:** if cables are not matched, some means of aligning DATA and DATA/ is needed.
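To get a feel for why cable matching matters at this rate, the sketch below converts a length mismatch between the DATA and DATA/ cables into skew expressed as a fraction of the unit interval. The propagation speed of 2e8 m/s is an assumed typical value, and `skew_ui` is a hypothetical helper name.

```python
def skew_ui(length_mismatch_m, bit_rate_bps=40e9, velocity_m_per_s=2.0e8):
    """Skew between two lines, in UI, caused by a cable length mismatch."""
    skew_s = length_mismatch_m / velocity_m_per_s  # extra propagation delay
    return skew_s * bit_rate_bps                   # delay as a fraction of one UI


if __name__ == "__main__":
    for mismatch_mm in (0.5, 1.0, 2.5):
        print(f"{mismatch_mm} mm mismatch -> {skew_ui(mismatch_mm * 1e-3):.2f} UI")
```

Under these assumptions, a mismatch of only 1 mm already consumes about a fifth of the 25 ps unit interval.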



The following slides present a proposal for a test system architecture designed to handle data rates of up to 43.2 Gb/s.

Let's begin by looking at the multiplexer architecture inside the DUT.

This slide provides a block diagram of an implementation of a 43Gb/s MUX chip:

1. This setup uses an internal oscillator. Alternatively, an external clock could be applied.
2. The clock is double pumped, running at 20GHz.
3. This clock drives the final MUX stage as well as (divided by 8) the intermediate buffer stage.
4. This divided clock is also provided to the low speed data input port through a phase shifter. This phase shifter can be used to compensate the delay through the loop into the ASIC providing the data for proper setup and hold times.
5. The input buffer latches the data from the framer ASIC.



Now let's take a look at the setup for testing a 43Gb/s MUX chip.

We need stimulus signals (on the left side) as well as analyzing capabilities. Let's look at the stimulus first:

1. 16 generators provide the data at 2.7Gb/s, which the MUX chip puts together.
2. A synthesizer provides the clock for running the data generation and the DUT.

The analysis needs to be done on the 43Gb/s signal.

1. The DUT (shown in blue) is connected to a BERT analyzer, which first performs CDR (clock and data recovery) and then de-muxes the high-speed data into 16 x 2.7Gb/s data streams.
2. The data processing and error checking is performed.



PRWS stands for: Pseudo Random Word Sequence.

This sounds similar to PRBS, except for "word." It sounds like it might have something to do with parallel data, and it does!

PRWS is the magic algorithmic parallel pattern that results in the appropriate PRBS after a MUX operates on it. Additionally, you could expect that a DEMUX would convert a PRBS into the appropriate PRWS.

As expected, parallel BERTs can create parallel "words" with two parameters – number of bits in the word and the PRBS polynomial we want to see on the output.

Let's see how it works!

If you look at our 4-to-1 MUX/DEMUX link, you would expect that the serial line should have the PRBS we want to test our devices with since this is the serial line that connects into the actual network.

Remember that the 4:1 MUX ratio means that data coming out of the MUX is clocked at a rate 4 times faster than the input data is being latched. This is represented by bit colors and numbered words on the left.

The slide shows that the PRBS is the result of the multiplexing action: latch at rate f, then shift out at rate 4f.

The PRWS is like the PRBS, except that PRWS is sliced into segments – words – whose length expresses the MUX/DEMUX ratio. To keep things simple, the parallel BERT generates the proper sequence of words that will create the proper PRBS.

On the other hand, if another digital test system cannot create this special word sequence, then that system cannot stimulate or analyze these devices realistically.
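The PRBS/PRWS relationship for a 4:1 link can be sketched as follows. This is an illustrative model only: it assumes a PRBS7 pattern (polynomial x^7 + x^6 + 1) and derives the words by slicing the serial stream, whereas a parallel BERT would generate the words algorithmically. All function names are hypothetical.

```python
def prbs7(n_bits, seed=0x7F):
    """Return n_bits of a PRBS7 stream (polynomial x^7 + x^6 + 1)."""
    state = seed & 0x7F
    out = []
    for _ in range(n_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        out.append(new_bit)
        state = ((state << 1) | new_bit) & 0x7F
    return out


def demux_to_words(serial_bits, ratio):
    """Slice a serial stream into words of `ratio` bits (the PRWS)."""
    usable = len(serial_bits) - len(serial_bits) % ratio
    return [serial_bits[i:i + ratio] for i in range(0, usable, ratio)]


def mux_words(words):
    """Shift each latched word out serially, first bit first."""
    return [bit for word in words for bit in word]


if __name__ == "__main__":
    serial = prbs7(4 * 127)
    words = demux_to_words(serial, 4)
    print("words per repetition:", len(words))
    print("MUX output equals PRBS:", mux_words(words) == serial)
```

Because 4 and 127 share no common factor, the word sequence repeats only after 127 words, so each parallel lane sees a long pattern rather than a short repeating one.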



This block diagram shows a DEMUX module architecture.

A 1:2 DEMUX makes up the first stage of this module. The CDR is located downstream of this DEMUX, so the recovered clock runs at 20GHz. The divided recovered clock drives a second 2:16 DEMUX stage.



This slide shows the setup for testing a DEMUX.

Again, the stimulus side is shown on the left and the analysis side is shown on the right.

Stimulus side

1. One 43Gb/s data stream is supplied to the DUT

2. The MUX module receives a clock signal from a synthesizer for best clock performance. The clock signal is divided to provide a reference for the 16 data generators.

3. The DEMUX receives the data stream. It does not need any clock as it has its own internal CDR.

The DEMUX chip puts out 16 data streams at 2.7Gb/s together with a recovered clock.

Analysis side

The analyzer consists of 17 channels, 16 of which are used for BER testing, while the 17th analyzer measures the clock timing so that the output timing measurement can be performed.



SFI-5 Description



For a SONET line card, the BERT must be able to provide optical signals at the 40 Gb/s rate as well as receive optical signals at that rate.

The optical signal generation is usually done with an external modulator.

Depending on the line card the coding scheme can be non-return-to-zero (NRZ) or return-to-zero (RZ). Typically NRZ is used in short haul networks and RZ used in long haul networks.

A flexible BERT should be able to provide both types of coding schemes to allow for testing of various types of line cards.



The illustrated solution would be great for manufacturers of BERTs. Think of the number of units that would have to be bought and the expense.

An alternative is to use a single BER generator and two modulators. Alternate optical channels would be modulated with DATA and DATA/.

On the receive side, a tunable optical filter would be used to select the channel to be measured and send it to a single BER analyzer.

This setup is illustrated in the following slide.






A highly flexible BERT design allows for testing MUX, DEMUX, and in-line components in any combination of electrical or optical interfaces.

The required data stream is tapped off and measured.

# Conclusion

- The challenges involved in testing 40Gb/s devices <u>can</u> be met with existing technology
- Improvement in technology will allow quicker and more accurate characterization of devices
- Flexible BERTs allow for ease in testing of various device configurations
- Memory-based BERTs allow for simulation of protocols for testing of many devices




